(Tummino, et al., 2021)
Drug-induced phospholipidosis confounds antiviral SARS-CoV-2 drug repurposing
Supplementary Materials XXX: Bayesian Infection by PLD regression model

Context

Many drug repurposing screens for anti-SARS-CoV-2 activity have identified cationic amphiphilic drugs (CADs), which are known to disrupt the metabolism and transport of phospholipids in a process called phospholipidosis (PLD) (Breiden & Sandhoff (2019)). Do these drugs inhibit SARS-CoV-2 through phospholipidosis? PLD is a plausible mechanism of action for cell-based viral inhibition because a) SARS-CoV-2 may require access to certain enzymes in the lysosome to prime viral surface proteins for cellular entry, and b) coronaviruses remodel the endoplasmic reticulum into double-membrane vesicles as sites of viral replication, which may be disrupted by phospholipidosis. However, PLD as a common mechanism of action for a wide range of drugs is problematic because it limits the shots-on-goal in translating them to the clinic, especially if they were incorrectly prioritized for alternative mechanisms of action. For example, PLD can in some cases cause mild liver toxicity, and there is some evidence that, among coronaviruses, SARS-CoV-2 access to lysosomal enzymes may be dispensable for infection. Thus it is a pressing question whether CADs inhibit SARS-CoV-2 infection through PLD or not.

Bayesian Regression Workflow

Given a dataset of compounds, each measured for inhibition of SARS-CoV-2 infection and for PLD at a range of doses, we will follow a Bayesian analysis workflow (Gelman et al. (2020), Van de Schoot et al. (2020)) to build a series of regression models using tools from the Stan ecosystem (Carpenter et al. (2017), Bürkner (2017), Vehtari et al. (2017), Gabry & Mahr (2017), Kay (2018), Wickham (2016), Wickham et al. (2019), R Core Team (2013)). For each model we will:

  1. Define and fit a probabilistic model, which combines a prior distribution over a set of parameters with the data, and draw samples from the posterior distribution over the parameters using Hamiltonian Monte Carlo.
  2. Check for sampling convergence.
  3. Use prior and posterior predictive checks to evaluate the model specification and fit.
  4. Use cross validation to evaluate the generalizability of the model.
  5. Assess inferences that can be made from the model.

Then, we will compare the models based on their fit to the data and the inferences that can be made from them.

Load data

First, we will load the data from the GraphPad Prism data file:

pzfx_fname <- "raw_data/Fig3C_correlations_and_SI_Fig8.pzfx"
pzfx::pzfx_tables(pzfx_fname)
##  [1] "Amiodarone_transformed"         "Amiodarone_transformed_b2"     
##  [3] "Sertraline_transformed"         "Elacridar_transformed"         
##  [5] "Chlorpromazine_transformed"     "HCQ_Transformed_b2"            
##  [7] "PB28_Transformed"               "PB28_051_transformed"          
##  [9] "PB28_061_transformed"           "PB28_061_transformed_b2"       
## [11] "Clemastine_transformed"         "Benztropine_transformed"       
## [13] "Bix01294_transformed"           "U 18666A_transformed"          
## [15] "Ellipticine_transformed_b2"     "Diphenhydramine_transformed_b2"
## [17] "Olanzapine_transformed_b2"      "Melperone_transformed"         
## [19] "Ebastine_transformed_b2"        "All_correlation"

In this file we have infection and PLD data for individual compounds in tables 1–19 and a compiled set of compounds in table 20. Each table summarizes the mean, standard deviation, and number of measurements of the effect of a treatment (drug, dose) on PLD amount, as log-fold change relative to Amiodarone, and on percent SARS-CoV-2 infection, normalized to the DMSO buffer control. Note that the PLD and infection measurements come from separate experiments in comparable cell systems (A549 vs. A549 with ACE2 over-expressed). Further experimental details are available in the main text.

data <- dplyr::bind_rows(
  load_table(pzfx_fname, 8, "ZZY-10-051"),
  load_table(pzfx_fname, 9, "ZZY-10-061"),
  load_table(pzfx_fname, 15, "Ellipticine"),
  load_table(pzfx_fname, 19, "Ebastine"),
  load_table(pzfx_fname, 20)) %>%
  dplyr::filter(!drug %in% c("Elacridar", "Melperone")) %>%
  dplyr::mutate(drug = ifelse(drug == "Amiodarone", "Amiodarone_b1", drug))

#### ADD BATCH 3 DATA ####
pld_batch3_fname <- "raw_data/PLD_Batch3_pooled_analysis.pzfx"
infection_batch3_fname <- "raw_data/raw_viral_RNA_normalized_for_matt.pzfx"

load_pld_batch3 <- function(fname, table,  drug_name=NULL) {
  data <- pzfx::read_pzfx(fname, table = table) %>%
    dplyr::rename(dose = `Drug concentration (mM)`) %>%
    dplyr::mutate(dose = dose * 1e-6) %>%
    dplyr::mutate(dose = ifelse(dose == 1e-12, 0, dose)) %>%
    dplyr::select(dose, tidyselect::starts_with("Efficacy")) %>%
    tidyr::pivot_longer(
      cols = tidyselect::starts_with("Efficacy"),
      names_to =c("measurement", "replica"),
      names_sep = "_") %>%
    dplyr::group_by(dose) %>%
    dplyr::summarize(
      pld_mean = mean(value/100),
      pld_sem = sd(value/100) / sqrt(3),
      pld_n = dplyr::n())

  if (!is.null(drug_name)) {
    data <- data %>% dplyr::mutate(drug = drug_name, .before = 1)
  } else {
    data <- data %>%
      dplyr::rename(drug = ROWTITLE) %>%
      dplyr::mutate(drug = drug %>% stringr::str_trim())
  }
  # return the tidied table explicitly rather than relying on the
  # (invisible) value of the last assignment
  data
}

infection_batch3 <- readxl::read_excel("raw_data/Baseline-corrected of RT-qPCR_new_PLD_set.xlsx") %>%
  dplyr::select(-`Baseline-Control`) %>%
  dplyr::rename(dose = `[Drug] M`) %>%
  tidyr::pivot_longer(
    cols = -dose,
    names_to = c("drug", "measurement"),
    names_sep = "_") %>%
  dplyr::mutate(drug = drug %>% stringr::str_replace(" \\(.*\\)$", "")) %>%
  dplyr::mutate(measurement = measurement %>% stringr::str_to_lower()) %>%
  tidyr::pivot_wider(
    id_cols = c(drug, dose),
    names_from = "measurement",
    names_prefix = "infection_") %>%
  dplyr::mutate(infection_n = 3)

amiodarone_batch3 <- load_pld_batch3(pld_batch3_fname, table = 30, "Amiodarone_b3") %>%
  dplyr::mutate(dose = signif(dose, 1)) %>%
  dplyr::full_join(
    data %>%
      dplyr::filter(drug == "Amiodarone_b1") %>%
      dplyr::mutate(drug = "Amiodarone_b3") %>%
      dplyr::mutate(dose = signif(dose, 1)) %>%
      dplyr::select(drug, dose, infection_mean, infection_sem, infection_n),
    by = c("drug", "dose"))

pld_batch3 <- dplyr::bind_rows(
  load_pld_batch3(pld_batch3_fname, table = 31, "Azithromycin"))


data_batch3 <- dplyr::left_join(
  pld_batch3 %>% dplyr::mutate(dose = signif(dose, 1)),
  infection_batch3 %>%
    dplyr::filter( drug %in% c("Azithromycin")) %>%
    dplyr::mutate(dose = signif(dose, 1)),
  by = c("drug", "dose")) %>%
  dplyr::bind_rows(amiodarone_batch3)


data <- data %>% dplyr::bind_rows(data_batch3)

data %>% dplyr::glimpse()
## Rows: 107
## Columns: 8
## $ drug           <chr> "ZZY-10-051", "ZZY-10-051", "ZZY-10-051", "ZZY-10-051",…
## $ dose           <dbl> 0.0e+00, 1.0e-07, 2.0e-07, 5.0e-07, 2.0e-06, 1.0e-05, 0…
## $ pld_mean       <dbl> 0.1372827, 0.1450817, 0.1630400, 0.2686867, 0.5024503, …
## $ pld_sem        <dbl> 0.03794082, 0.04839236, 0.04366250, 0.10680553, 0.07134…
## $ pld_n          <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
## $ infection_mean <dbl> 100.000000, 73.451491, 78.413389, 99.236615, 50.291301,…
## $ infection_sem  <dbl> 15.30633596, 8.81195917, 16.62922066, 11.12910648, 7.36…
## $ infection_n    <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…

Add prospective batch 3 data

prospective_pld_batch3 <- dplyr::bind_rows(
  load_pld_batch3(pld_batch3_fname, table = 33, "Duloxetine"),
  load_pld_batch3(pld_batch3_fname, table = 35, "Fluspirilene"),
  load_pld_batch3(pld_batch3_fname, table = 36, "Methdilazine"),
  load_pld_batch3(pld_batch3_fname, table = 38, "Thiethylperazine"),
  load_pld_batch3(pld_batch3_fname, table = 39, "Toremifene"))

prospective_data <- prospective_pld_batch3 %>%
  dplyr::mutate(dose = signif(dose, 1)) %>%
  dplyr::inner_join(
    infection_batch3 %>%
      dplyr::mutate(dose = signif(dose, 1)) %>%
      dplyr::filter(
        drug %in% c("Duloxetine", "Fluspirilene", "Methdilazine", "Thiethylperazine", "Toremifene"),
        !(drug == "Fluspirilene" & dose == 1e-5),
        !(drug == "Thiethylperazine" & dose == 1e-5)),
    by = c("drug", "dose")) %>%
  dplyr::filter(!is.na(dose))
    

data_all <- dplyr::bind_rows(data, prospective_data)

Note that some of the compounds tested do not cause substantial phospholipidosis and could potentially be excluded from the analysis. These compounds cause less than 25% PLD relative to Amiodarone over all doses tested:

## # A tibble: 0 x 3
## # … with 3 variables: Drug <chr>, Dose (uM) <dbl>, Max PLD % <chr>
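This exclusion criterion can be sketched in base R; the drug names and PLD values below are hypothetical, not taken from the screen:

```r
# Hypothetical sketch of the exclusion criterion: flag drugs whose maximum
# PLD (as a fraction of the Amiodarone response) stays below 0.25 across
# all tested doses. All names and values here are made up.
pld <- data.frame(
  drug     = c("DrugX", "DrugX", "DrugY", "DrugY"),
  pld_mean = c(0.10,    0.20,    0.30,    0.90))

max_pld <- tapply(pld$pld_mean, pld$drug, max)  # per-drug maximum over doses
low_pld_drugs <- names(max_pld)[max_pld < 0.25]
low_pld_drugs
```

For the data above, only `DrugX` would be flagged; for the actual dataset the resulting table is empty, so no compounds are excluded.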

Fit Models

Regression parameterization

We will now fit a series of models for infection as a function of PLD. Virus quantification via RT-qPCR captures complex infection dynamics. To make the influence of PLD easier to identify, we will compare three parameterizations of the infection response: log, log-sqrt, and log-log. This shows that all three parameterizations are reasonable, but that the log-sqrt parameterization best stabilizes the variance at high and low values of infection.
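To see why the square-root response transform stabilizes the variance at the extremes, compare it against a log response on a few illustrative percent-infection values (these numbers are made up for illustration, not data from the screen):

```r
infection <- c(0.1, 1, 10, 100)  # percent infection, illustrative values

# log-sqrt response: sqrt maps (0, 100] onto (0, 10] and stays finite and
# well-behaved for near-complete inhibition (infection near 0)
sqrt(infection)

# log response: diverges toward -Inf as infection approaches 0, inflating
# the apparent variance of strongly inhibited measurements
log(infection)
```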

Fit flat regression model

Now we will fit a flat regression model to serve as a model of no dependence of infection on PLD. We’ll go through in detail how to interpret the plots for this model, so we can re-use them when fitting later models.

model_logsqrt_flat <- brms::brm(
  formula = sqrt(infection_mean) ~ 1,
  prior = c(
    brms::prior(student_t(3, 5, 5),  class = "Intercept"),
    brms::prior(student_t(3, 0, 5),  class = "sigma")),
  data = data,
  iter = 20000,
  cores = 4)
model_logsqrt_flat$name <- "flat"

Here is a summary of the model fit. At this point we’re looking primarily at the quality of the simulation. There are three areas to consider:

  1. The Rhat should be close to 1, indicating the chains are exploring the same regions of parameter space and not getting stuck.
  2. The bulk and tail effective sample sizes (Bulk_ESS, Tail_ESS) should be > 1000.
  3. There should be no divergences. The NUTS algorithm can detect when the parameter space is difficult to sample.

If these criteria are not all satisfied, then either run the simulation for longer or reparameterize the model. We’ll return below to interpreting the model specification and parameter estimates.

##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: sqrt(infection_mean) ~ 1 
##    Data: data (Number of observations: 107) 
## Samples: 4 chains, each with iter = 20000; warmup = 10000; thin = 1;
##          total post-warmup samples = 40000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept     6.30      0.34     5.63     6.96 1.00    33144    25655
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     3.50      0.24     3.07     4.02 1.00    33311    25287
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Check for convergence

Since the Rhat is close to 1 and Bulk_ESS and Tail_ESS are well above 1000, the simulation looks to have converged. To further check convergence we will evaluate the traceplot, which shows the sampled value for each parameter and each chain across the MCMC trajectory. We’re looking for well-mixed chains throughout the sampling.
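As an illustration of what Rhat measures, a simplified split-Rhat can be computed in base R on simulated chains (brms reports a more robust rank-normalized variant; the chains below are simulated, not draws from the model above):

```r
set.seed(42)

# Simplified split-Rhat: split each chain in half, then compare the
# between-half variance of the means to the average within-half variance.
rhat <- function(chains) {  # chains: iterations x chains matrix
  n <- floor(nrow(chains) / 2)
  halves <- cbind(chains[1:n, ], chains[(n + 1):(2 * n), ])
  B <- n * var(colMeans(halves))      # between-chain variance
  W <- mean(apply(halves, 2, var))    # within-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)
}

mixed <- matrix(rnorm(4000), ncol = 4)            # 4 well-mixed chains
stuck <- mixed + rep(c(0, 0, 0, 5), each = 1000)  # chain 4 stuck elsewhere

rhat(mixed)  # close to 1: chains explore the same region
rhat(stuck)  # well above 1: one chain is sampling a different region
```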

and the rank histogram. The idea here is that if the simulation has converged, then the ranks of each chain’s samples within the pooled draws should be uniformly distributed, which can be visually assessed with a histogram.
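A minimal base-R sketch of this idea, using simulated chains rather than the model draws:

```r
set.seed(7)
draws <- matrix(rnorm(4000), ncol = 4)  # iterations x chains, well mixed
ranks <- matrix(rank(draws), ncol = 4)  # each draw's rank in the pooled draws

# For a converged simulation each chain's ranks are ~ uniform on 1..4000,
# so each chain's mean rank should be near (4000 + 1) / 2 = 2000.5
colMeans(ranks)

# A chain sampling a shifted region shows up as a skewed rank distribution
stuck <- draws
stuck[, 4] <- stuck[, 4] + 5
ranks_stuck <- matrix(rank(stuck), ncol = 4)
colMeans(ranks_stuck)  # chain 4's mean rank sits far above the others
```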

Check specification of priors

Well-specified priors should incorporate available domain knowledge, ranging from uninformative, through weakly informative, to strongly informative. Further, the extent to which the inferences are sensitive to the priors should be made clear.

Check quality of model fit

To assess the model fit we will use leave-one-out cross-validation: re-fitting the model with each (drug, dose) observation held out and measuring the posterior predictive probability of the held-out observation. This is a reliable way to compare models and estimate how well a model will generalize, which we’ll return to after we have defined all the models. To make it computationally tractable, we approximate it by re-weighting draws from the full posterior distribution using Pareto smoothed importance sampling. To interpret the scores: we want the expected log pointwise predictive density for a new dataset (elpd_loo) to be large, and p_loo, the difference between elpd_loo and the non-cross-validated log posterior predictive density, to be small. p_loo can be interpreted as the effective number of parameters; ideally it should be less than the number of observations (p_loo < N) and less than the actual number of parameters in the model (p_loo < p). A rule of thumb for model selection is that, all else being equal, a model should be preferred when its elpd_loo is better by at least two to four standard errors of the difference.
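The quantities elpd_loo and p_loo can be illustrated with plain importance sampling on a toy normal-mean model (the loo package additionally applies Pareto smoothing to stabilize the weights; everything below is simulated, not taken from the models in this document):

```r
set.seed(1)
y <- rnorm(25, mean = 5, sd = 2)
sigma <- 2; n <- length(y); S <- 20000

# Exact posterior draws for mu under a flat prior with known sigma
mu <- rnorm(S, mean(y), sigma / sqrt(n))

# S x n matrix of pointwise log-likelihoods
ll <- sapply(y, function(yi) dnorm(yi, mu, sigma, log = TRUE))

# Non-cross-validated log pointwise predictive density
lpd <- sum(log(colMeans(exp(ll))))

# Importance-sampled LOO: weight each draw by 1 / p(y_i | mu_s), which gives
# elpd_loo_i = -log(mean_s exp(-ll_is))
elpd_loo <- sum(-log(colMeans(exp(-ll))))

p_loo <- lpd - elpd_loo  # effective number of parameters; ~1 for this model
c(elpd_loo = elpd_loo, p_loo = p_loo)
```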

An interesting feature of leave-one-out cross-validation is that it can be used to detect outliers, via the pareto_k estimate for each point. If there were any, they would show up as values > 0.5 or worse here.

## 
## Computed from 40000 by 107 log-likelihood matrix
## 
##          Estimate  SE
## elpd_loo   -286.0 3.9
## p_loo         1.3 0.1
## looic       572.1 7.7
## ------
## Monte Carlo SE of elpd_loo is 0.0.
## 
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.

We can also estimate leave-one-drug-out cross-validation, where the question is how well the model generalizes to new drugs.

## 
## Based on 18-fold cross-validation
## 
##            Estimate  SE
## elpd_kfold   -285.9 3.9
## p_kfold         1.1 0.3
## kfoldic       571.8 7.8

A way to visualize model mis-specification is through posterior predictive checks. The idea is that if the model fits the data, we should have a hard time distinguishing the data from fake data generated from the fitted model. In the upper two plots, samples generated from the model are shown as thin blue lines and the actual data as a thick black line. In the upper left plot, the x-axis is the response, in this case the transformed percent infection; in the upper right plot, the x-axis has been stretched and compressed so that the samples from the model are approximately uniform (they dip down at the edges because of uncertainty in the model). For more details on posterior predictive checks and these plots see Gabry et al. (2019). Below, the average prediction error is shown as a function of the response. For the flat model, the further above or below the mid-line, the worse the error.
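The logic of a posterior predictive check can be sketched in base R on a toy dataset (everything below is simulated; the plots in this document are produced with bayesplot):

```r
set.seed(3)
y <- rnorm(50, mean = 5, sd = 2)  # toy "observed" data

# Toy posterior draws for (mu, sigma); in practice these come from the fit.
# sigma is held fixed here for simplicity.
S <- 1000
mu_draws    <- rnorm(S, mean(y), sd(y) / sqrt(length(y)))
sigma_draws <- rep(sd(y), S)

# One replicated dataset per posterior draw
y_rep <- sapply(seq_len(S), function(s)
  rnorm(length(y), mu_draws[s], sigma_draws[s]))

# Posterior predictive p-value for a test statistic (here, the sd):
# values near 0 or 1 indicate misfit; mid-range values are consistent
ppp <- mean(apply(y_rep, 2, sd) > sd(y))
ppp
```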

Interpret the model

The pairs plot shows whether there is any correlation among the parameters. This can help interpret the model, or suggest alternative models or parameterizations. This shows very little correlation between the intercept and sigma (the b_ prefix comes from the brms framework).

## $title
## [1] "Pairs plot: flat model"
## 
## $subtitle

## 
## attr(,"class")
## [1] "labels"

Now we will plot draws from the fitted model on a scatter plot of the infection vs. PLD data.

## Warning: Removed 6 rows containing missing values (geom_point).

Fit linear regression model

Now we will fit a linear regression model for infection_mean by pld_mean.

model_logsqrt_linear <- brms::brm(
  formula = sqrt(infection_mean) ~ log(pld_mean),
  prior = c(
    brms::prior(student_t(3, 5, 5),  class = "Intercept"),
    brms::prior(student_t(3, 0, 5),  class = "sigma"),
    brms::prior(normal(0, 10),  class = "b")),
  iter = 20000,
  cores = 4,
  data = data,
  verbose = FALSE)
model_logsqrt_linear$name <- "linear"

model_logsqrt_linear_all <- brms::brm(
  formula = sqrt(infection_mean) ~ log(pld_mean),
  prior = c(
    brms::prior(student_t(3, 5, 5),  class = "Intercept"),
    brms::prior(student_t(3, 0, 5),  class = "sigma"),
    brms::prior(normal(0, 10),  class = "b")),
  iter = 20000,
  cores = 4,
  data = data_all,
  verbose = FALSE)
model_logsqrt_linear_all$name <- "linear_all"
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: sqrt(infection_mean) ~ log(pld_mean) 
##    Data: data (Number of observations: 107) 
## Samples: 4 chains, each with iter = 20000; warmup = 10000; thin = 1;
##          total post-warmup samples = 40000
## 
## Population-Level Effects: 
##             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept       1.57      0.39     0.81     2.33 1.00    39759    29462
## logpld_mean    -4.11      0.29    -4.67    -3.54 1.00    40920    29746
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     2.05      0.14     1.79     2.35 1.00    40294    29230
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Check for convergence

Check specification of priors

Check quality of model fit

Leave-one-out cross validation

## 
## Computed from 40000 by 107 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo   -229.2  8.0
## p_loo         2.7  0.5
## looic       458.5 16.0
## ------
## Monte Carlo SE of elpd_loo is 0.0.
## 
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.

Leave-one-drug-out cross validation:

## 
## Based on 18-fold cross-validation
## 
##            Estimate   SE
## elpd_kfold   -234.3  9.0
## p_kfold         7.8  1.7
## kfoldic       468.6 18.0

Interpret the model

## $title
## [1] "Pairs plot: linear model"
## 
## $subtitle

## 
## attr(,"class")
## [1] "labels"

Now we will plot draws from the fitted model on a scatter plot of the infection vs. PLD data.

## Warning: Removed 6 rows containing missing values (geom_point).

Fit sigmoid2 regression model

model_logsqrt_sigmoid2 <- brms::brm(
  formula = brms::brmsformula(
    sqrt(infection_mean) ~ 10*inv_logit(hill*4/10*(log(pld_mean) - ic50)),
      ic50 + hill ~ 1,
      nl=TRUE),
  prior = c(
    brms::prior(normal(-1, 5), nlpar="ic50"),
    brms::prior(normal(-7, 2), nlpar="hill")),
  inits=function(){
    list(
      ic50=as.array(-1),
      hill=as.array(rnorm(1, -7, 3)))},
  iter=20000,
  cores = 4,
  data = data)
model_logsqrt_sigmoid2$name <- "sigmoid2"
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: sqrt(infection_mean) ~ 10 * inv_logit(hill * 4/10 * (log(pld_mean) - ic50)) 
##          ic50 ~ 1
##          hill ~ 1
##    Data: data (Number of observations: 107) 
## Samples: 4 chains, each with iter = 20000; warmup = 10000; thin = 1;
##          total post-warmup samples = 40000
## 
## Population-Level Effects: 
##                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## ic50_Intercept    -0.84      0.06    -0.95    -0.73 1.00    32683    27015
## hill_Intercept    -5.64      0.61    -6.94    -4.54 1.00    31003    23332
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     2.03      0.14     1.77     2.33 1.00    32601    27870
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
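To interpret these estimates, we can evaluate the fitted mean function at the posterior means (a rough point-estimate reading of the summary above; the plots below summarize the full posterior):

```r
inv_logit <- function(x) 1 / (1 + exp(-x))

# Mean function on the sqrt-infection scale, as specified in the model
# formula above; defaults are the posterior mean estimates from the summary.
mu_sqrt_infection <- function(log_pld, ic50 = -0.84, hill = -5.64)
  10 * inv_logit(hill * 4 / 10 * (log_pld - ic50))

# At log(pld_mean) = ic50 the curve is at its midpoint:
# sqrt(infection) = 5, i.e. 25% infection
mu_sqrt_infection(-0.84)^2

# The midpoint on the original PLD scale: roughly 43% of the Amiodarone
# PLD response. Since hill < 0, infection decreases as PLD increases.
exp(-0.84)
```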

Check for convergence

Check specification of priors

Check quality of model fit

Leave-one-out cross validation

## 
## Computed from 40000 by 107 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo   -229.0  8.8
## p_loo         3.6  0.8
## looic       458.0 17.6
## ------
## Monte Carlo SE of elpd_loo is 0.0.
## 
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.

Leave-one-drug-out cross validation

## 
## Based on 18-fold cross-validation
## 
##            Estimate   SE
## elpd_kfold   -235.9 10.6
## p_kfold        10.5  2.9
## kfoldic       471.8 21.1

z <- model_logsqrt_sigmoid2 %>%
  brms::predictive_error(
    newdata = prospective_data)
## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced

## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced

## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced

## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced

## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced

## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced
z <- model_logsqrt_sigmoid2 %>%
  residuals(
    newdata = prospective_data,
    method = "posterior_predict")
## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced

## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced

## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced

## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced

## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced

## Warning in rnorm(.x1, mean = .x2, sd = .x3): NAs produced
prospective_data %>%
  dplyr::bind_cols(as.data.frame(z)) %>%
  dplyr::mutate(infection_est = Estimate*Estimate) %>%
  dplyr::select(-pld_n, -infection_n) %>%
  dplyr::mutate(across(3:11, signif, 3)) %>%
  dplyr::arrange(pld_mean) %>%
  data.frame()
##                drug  dose pld_mean pld_sem infection_mean infection_sem
## 1        Toremifene 1e-07    0.167  0.0202        153.000        29.500
## 2      Fluspirilene 1e-07    0.178  0.0181         89.400        13.700
## 3        Duloxetine 0e+00    0.197  0.0448        100.000        14.300
## 4      Fluspirilene 0e+00    0.197  0.0448        100.000        21.200
## 5  Thiethylperazine 0e+00    0.197  0.0448        100.000         4.740
## 6        Toremifene 0e+00    0.197  0.0448        100.000        11.300
## 7        Toremifene 2e-07    0.204  0.0313         96.100         8.250
## 8  Thiethylperazine 1e-07    0.218  0.0189         81.500         3.950
## 9  Thiethylperazine 2e-07    0.236  0.0211         89.400         3.100
## 10 Thiethylperazine 5e-07    0.260  0.0199         67.500         5.580
## 11       Toremifene 5e-07    0.263  0.0413        109.000        13.300
## 12     Fluspirilene 2e-07    0.278  0.0282         34.700         8.050
## 13 Thiethylperazine 2e-06    0.391  0.0300         33.200         1.730
## 14       Toremifene 2e-06    0.409  0.0255         29.500        13.100
## 15     Fluspirilene 5e-07    0.443  0.0230         12.800         3.040
## 16     Fluspirilene 2e-06    0.460  0.0213          5.460         1.890
## 17       Duloxetine 1e-07    0.539  0.1170         77.100         9.690
## 18       Duloxetine 2e-07    0.784  0.1220         94.000        20.300
## 19       Toremifene 1e-05    0.872  0.0312          2.610         0.359
## 20       Duloxetine 1e-05    0.914  0.0780          0.635         0.149
## 21       Duloxetine 2e-06    0.972  0.0747         44.800         8.880
## 22     Methdilazine 0e+00       NA      NA        100.000        18.300
## 23     Methdilazine 1e-07       NA      NA         51.100        15.100
## 24     Methdilazine 2e-07       NA      NA         29.400         5.840
## 25     Methdilazine 5e-07       NA      NA         21.400         4.410
## 26     Methdilazine 2e-06       NA      NA          1.630         0.291
## 27     Methdilazine 1e-05       NA      NA          0.940         0.134
##    Estimate Est.Error   Q2.5 Q97.5 infection_est
## 1     3.430      2.04 -0.585  7.44       11.8000
## 2     0.685      2.06 -3.390  4.72        0.4690
## 3     1.480      2.05 -2.530  5.51        2.1800
## 4     1.460      2.06 -2.610  5.54        2.1400
## 5     1.460      2.06 -2.560  5.49        2.1300
## 6     1.460      2.05 -2.600  5.43        2.1300
## 7     1.370      2.05 -2.670  5.40        1.8800
## 8     0.820      2.07 -3.200  4.88        0.6720
## 9     1.500      2.06 -2.520  5.55        2.2400
## 10    0.628      2.06 -3.410  4.66        0.3950
## 11    2.890      2.06 -1.140  6.90        8.3400
## 12   -1.390      2.05 -5.420  2.64        1.9300
## 13    0.231      2.06 -3.790  4.26        0.0532
## 14    0.130      2.06 -3.890  4.18        0.0170
## 15   -1.260      2.05 -5.260  2.79        1.6000
## 16   -2.310      2.05 -6.360  1.70        5.3600
## 17    5.010      2.08  0.929  9.10       25.1000
## 18    7.620      2.08  3.560 11.70       58.1000
## 19   -0.115      2.06 -4.140  3.95        0.0131
## 20   -0.764      2.06 -4.810  3.27        0.5840
## 21    5.280      2.05  1.220  9.31       27.8000
## 22      NaN        NA     NA    NA           NaN
## 23      NaN        NA     NA    NA           NaN
## 24      NaN        NA     NA    NA           NaN
## 25      NaN        NA     NA    NA           NaN
## 26      NaN        NA     NA    NA           NaN
## 27      NaN        NA     NA    NA           NaN
## Warning: Removed 6 rows containing missing values (geom_point).

Compare all models

Compare based on leave-one-out cross validation

##          elpd_diff se_diff
## sigmoid2   0.0       0.0  
## linear    -0.2       2.0  
## flat     -57.0       9.6

Compare based on leave-drug-out cross validation

##          elpd_diff se_diff
## linear     0.0       0.0  
## sigmoid2  -1.6       2.8  
## flat     -51.6       9.6
## Method: stacking
## ------
##          weight
## flat     0.083 
## linear   0.000 
## sigmoid2 0.917

Session Info

For reproducibility, here is information about the R version and loaded packages:

sessionInfo()
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS High Sierra 10.13.2
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] patchwork_1.1.1    MPStats_0.0.0.9000 bayesplot_1.8.0    brms_2.15.0       
##  [5] Rcpp_1.0.6         pzfx_0.3.0         forcats_0.5.1      stringr_1.4.0     
##  [9] dplyr_1.0.5        purrr_0.3.4        readr_1.4.0        tidyr_1.1.3       
## [13] tibble_3.1.1       ggplot2_3.3.3      tidyverse_1.3.1    plyr_1.8.6        
## [17] knitr_1.32        
## 
## loaded via a namespace (and not attached):
##   [1] readxl_1.3.1         backports_1.2.1      igraph_1.2.6        
##   [4] svUnit_1.0.6         splines_4.0.4        crosstalk_1.1.1     
##   [7] listenv_0.8.0        rstantools_2.1.1     inline_0.3.17       
##  [10] digest_0.6.27        htmltools_0.5.1.1    rsconnect_0.8.17    
##  [13] fansi_0.4.2          magrittr_2.0.1       globals_0.14.0      
##  [16] modelr_0.1.8         RcppParallel_5.1.2   matrixStats_0.58.0  
##  [19] xts_0.12.1           prettyunits_1.1.1    colorspace_2.0-0    
##  [22] rvest_1.0.0          ggdist_2.4.0         haven_2.4.0         
##  [25] xfun_0.22            callr_3.7.0          crayon_1.4.1        
##  [28] jsonlite_1.7.2       lme4_1.1-26          zoo_1.8-9           
##  [31] glue_1.4.2           gtable_0.3.0         V8_3.4.0            
##  [34] distributional_0.2.2 pkgbuild_1.2.0       rstan_2.21.2        
##  [37] abind_1.4-5          scales_1.1.1         mvtnorm_1.1-1       
##  [40] DBI_1.1.1            miniUI_0.1.1.1       xtable_1.8-4        
##  [43] stats4_4.0.4         StanHeaders_2.21.0-7 DT_0.18             
##  [46] htmlwidgets_1.5.3    httr_1.4.2           threejs_0.3.3       
##  [49] arrayhelpers_1.1-0   ellipsis_0.3.1       tidybayes_2.3.1     
##  [52] pkgconfig_2.0.3      loo_2.4.1            farver_2.1.0        
##  [55] sass_0.3.1           dbplyr_2.1.1         utf8_1.2.1          
##  [58] tidyselect_1.1.0     labeling_0.4.2       rlang_0.4.10        
##  [61] reshape2_1.4.4       later_1.1.0.1        munsell_0.5.0       
##  [64] cellranger_1.1.0     tools_4.0.4          cli_2.4.0           
##  [67] generics_0.1.0       broom_0.7.6          ggridges_0.5.3      
##  [70] evaluate_0.14        fastmap_1.1.0        yaml_2.2.1          
##  [73] processx_3.5.1       fs_1.5.0             future_1.21.0       
##  [76] nlme_3.1-152         mime_0.10            projpred_2.0.2      
##  [79] xml2_1.3.2           compiler_4.0.4       shinythemes_1.2.0   
##  [82] rstudioapi_0.13      gamm4_0.2-6          curl_4.3            
##  [85] reprex_2.0.0         statmod_1.4.35       bslib_0.2.4         
##  [88] stringi_1.5.3        highr_0.9            ps_1.6.0            
##  [91] Brobdingnag_1.2-6    lattice_0.20-41      Matrix_1.3-2        
##  [94] nloptr_1.2.2.2       markdown_1.1         shinyjs_2.0.0       
##  [97] vctrs_0.3.7          pillar_1.6.0         lifecycle_1.0.0     
## [100] jquerylib_0.1.3      bridgesampling_1.1-2 httpuv_1.5.5        
## [103] R6_2.5.0             promises_1.2.0.1     gridExtra_2.3       
## [106] parallelly_1.24.0    codetools_0.2-18     boot_1.3-27         
## [109] colourpicker_1.1.0   MASS_7.3-53.1        gtools_3.8.2        
## [112] assertthat_0.2.1     withr_2.4.2          shinystan_2.5.0     
## [115] mgcv_1.8-35          parallel_4.0.4       hms_1.0.0           
## [118] grid_4.0.4           coda_0.19-4          minqa_1.2.4         
## [121] rmarkdown_2.7        shiny_1.6.0          lubridate_1.7.10    
## [124] base64enc_0.1-3      dygraphs_1.1.1.6

References

Breiden, B., & Sandhoff, K. (2019). Emerging mechanisms of drug-induced phospholipidosis. Biological Chemistry, 401(1), 31–46.
Bürkner, P.-C. (2017). brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software, 80(1), 1–28.
Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., & others. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1).
Gabry, J., & Mahr, T. (2017). bayesplot: Plotting for Bayesian models. R package version 1.0.
Gabry, J., Simpson, D., Vehtari, A., Betancourt, M., & Gelman, A. (2019). Visualization in Bayesian workflow. Journal of the Royal Statistical Society: Series A, 182(2), 389–402.
Gelman, A., Vehtari, A., Simpson, D., Margossian, C. C., Carpenter, B., Yao, Y., Kennedy, L., Gabry, J., Bürkner, P.-C., & Modrák, M. (2020). Bayesian workflow. http://arxiv.org/abs/2011.01808
Kay, M. (2018). tidybayes: Tidy data and geoms for Bayesian models. R package version 1.3.
R Core Team. (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Van de Schoot, R., Veen, D., Smeets, L., Winter, S. D., & Depaoli, S. (2020). A tutorial on using the WAMBS checklist to avoid the misuse of Bayesian statistics. In Small Sample Size Solutions: A Guide for Applied Researchers and Practitioners (pp. 30–49).
Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413–1432.
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer.
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T., Miller, E., Bache, S., Müller, K., Ooms, J., Robinson, D., Seidel, D., Spinu, V., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686.